36 Generalizability Theory

Authors

Richard J. Shavelson

Noreen M. Webb

Abstract

Generalizability (G) theory is a statistical theory for evaluating the dependability (or reliability) of behavioral measurements (Cronbach, Gleser, Nanda, & Rajaratnam, 1972; see also Brennan, 2001; Shavelson & Webb, 1991). G theory permits the researcher to address such questions as: Is the sampling of tasks or judges the major source of measurement error? Can I improve the reliability of the measurement better by increasing the number of tasks or the number of judges, or is some combination of the two more effective? Are the test scores adequately reliable to make decisions about the level of a person’s performance for a certification decision?

G theory grew out of the recognition that the undifferentiated error in classical test theory (Feldt & Brennan, 1989) provided too gross a characterization of the potential and/or actual sources of measurement error. In classical test theory measurement error is undifferentiated random variation; the theory does not distinguish among various possible sources. G theory pinpoints the sources of systematic and unsystematic error variation, disentangles them, and estimates each one. Moreover, in contrast to the classical parallel-test assumptions of equal observed-score means, variances, and covariances, G theory assumes only randomly parallel tests sampled from the same universe. Finally, whereas classical test theory focuses on relative (rank-order) decisions (e.g., student admission to selective colleges), G theory distinguishes between relative (“norm-referenced”) and absolute (“criterion-” or “domain-referenced”) decisions for which a behavioral measurement is used.

In G theory, a behavioral measurement (e.g., a test score) is conceived of as a sample from a universe of admissible observations. This universe consists of all possible observations that decision makers consider to be acceptable substitutes (e.g., scores sampled on Occasions 2 and 3) for the observation in hand (scores on Occasion 1).
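
The questions above about tasks versus judges, and about relative versus absolute decisions, can be explored numerically once variance components are available. The following is a minimal sketch, not the authors' procedure: it assumes a fully crossed, random-effects persons x tasks x judges design and uses the standard error-variance formulas for that design. The class and function names and every numeric value are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """Variance components for a crossed persons x tasks x judges design.

    All values passed in below are made up for illustration.
    """
    p: float      # persons (universe-score variance)
    t: float      # tasks
    j: float      # judges
    pt: float     # person x task interaction
    pj: float     # person x judge interaction
    tj: float     # task x judge interaction
    ptj_e: float  # three-way interaction confounded with residual error


def relative_error(vc: VarianceComponents, n_t: int, n_j: int) -> float:
    """Error variance for relative (rank-order) decisions."""
    return vc.pt / n_t + vc.pj / n_j + vc.ptj_e / (n_t * n_j)


def absolute_error(vc: VarianceComponents, n_t: int, n_j: int) -> float:
    """Error variance for absolute (criterion-referenced) decisions."""
    return (relative_error(vc, n_t, n_j)
            + vc.t / n_t + vc.j / n_j + vc.tj / (n_t * n_j))


def g_coefficient(vc: VarianceComponents, n_t: int, n_j: int) -> float:
    """Generalizability coefficient (relative decisions)."""
    return vc.p / (vc.p + relative_error(vc, n_t, n_j))


def phi_coefficient(vc: VarianceComponents, n_t: int, n_j: int) -> float:
    """Index of dependability (absolute decisions)."""
    return vc.p / (vc.p + absolute_error(vc, n_t, n_j))


# Hypothetical components in which task sampling dominates the error.
example = VarianceComponents(p=0.5, t=0.1, j=0.02,
                             pt=0.4, pj=0.04, tj=0.0, ptj_e=0.2)

# Compare alternative designs: more tasks, more judges, or both.
for n_t, n_j in [(1, 1), (2, 1), (1, 2), (2, 2), (4, 1)]:
    print(f"tasks={n_t} judges={n_j} "
          f"G={g_coefficient(example, n_t, n_j):.3f} "
          f"Phi={phi_coefficient(example, n_t, n_j):.3f}")
```

With these illustrative values the person x task component is the largest error source, so adding a task raises both coefficients more than adding a judge; a different set of components could reverse that conclusion.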
A measurement situation has characteristic features such as test form, test item, rater, and/or test occasion. Each characteristic feature is called a facet of a measurement. A universe of admissible observations, then, is defined by all possible combinations of the levels of the facets (e.g., items, occasions). Consider a generalizability study of students’ scores on a measure of academic self-concept. Suppose students (persons) responded to three self-concept items randomly selected from a large domain of such items on each of two randomly selected occasions (see Table 36–1). The items


Related resources

Score Generalizability of Writing Assessment: the Effect of Rater’s Gender

The score reliability of language performance tests has attracted increasing interest. Classical Test Theory cannot examine multiple sources of measurement error. Generalizability theory extends Classical Test Theory to provide a practical framework to identify and estimate multiple factors contributing to the total variance of measurement. Generalizability theory by using analysis of variance ...


The Life Giving Properties in the Structure of the Ganjali-Khan Square in Kerman based on Alexander’s Theory of Order

In studying the case studies of traditional architectures according to Christopher Alexander’s theory about “the nature of order” and the fifteen fundamental properties introduced, it is important to note that the ontology of the theory is based on humans’ indigenous feeling about architecture, which subsequently implies that these kinds of studies should be based on people’s cognitive images induced fro...


Generalizing Generalizability in Information Systems Research

The concept of generalizability is not homogeneous and monolithic, but can be analyzed into four types: the generalizability of a theory to different settings, the generalizability of a theory within a setting, the generalizability of a measurement or observation, and the generalizability of a variable, construct, or other concept. In this study, we affirm the legitimacy of the statistical, sam...


Chapter 1 Generalizability Theory and Item Response Theory

Item response theory is usually applied to items with a selected-response format, such as multiple choice items, whereas generalizability theory is usually applied to constructed-response tasks assessed by raters. However, in many situations, raters may use rating scales consisting of items with a selected-response format. This chapter presents a short overview of how item response theory and g...
